42 research outputs found

    Impact of In-Service Training on Performance of Teachers A Case of STEVTA Karachi Region

    Learning that takes place in a classroom is significantly associated with teachers and the actions they take in the classroom. Therefore, the quality of education can be improved by putting more focus on teaching methodologies and the way teachers spend time in classrooms. This study aimed at examining the impact of in-service training on the performance of teachers. It is generally believed that, with the implementation of certain in-service training programmes, the performance of teachers regarding their professional skills, knowledge and experience can be significantly improved. The target population of the present study included the in-service teachers offering their services at the Sindh Technical Education & Vocational Training Authority (STEVTA), Government of Sindh, Karachi Region. Using close-ended questions, the perceptions and experiences of teachers (n=150, m=100, f=50) who availed the opportunity to get in-service training were gathered. Findings of the study revealed the positive impact of in-service training programmes on the performance of teachers. The study also revealed the positive perception of teachers regarding their professional growth. It recommended that in-service training programmes be introduced in line with the subject rather than general

    Urdu AI: writeprints for Urdu authorship identification

    This is an accepted manuscript of an article published by ACM in ACM Transactions on Asian and Low-Resource Language Information Processing on 31/10/2021, available online at: https://doi.org/10.1145/3476467 The accepted version of the publication may differ from the final published version. The authorship identification task aims at identifying the original author of an anonymous text sample from a set of candidate authors. It has several application domains such as digital text forensics and information retrieval. These application domains are not limited to a specific language. However, most authorship identification studies focus on English, and limited attention has been paid to Urdu. Moreover, existing Urdu authorship identification solutions lose accuracy as the number of training samples per candidate author decreases and as the number of candidate authors increases. Consequently, these solutions are inapplicable to real-world cases. To overcome these limitations, we formulate a stylometric feature space. Based on this feature space, we use an authorship identification solution that transforms each text sample into a point set, retrieves candidate text samples, and relies on the nearest neighbour classifier to predict the original author of the anonymous text sample. To evaluate our method, we create a significantly larger corpus than those used in existing studies and conduct several experimental studies, which show that our solution can overcome the limitations of existing studies and reports an accuracy level of 94.03%, which is higher than all previous authorship identification works

    A scalable framework for stylometric analysis query processing

    This is an accepted manuscript of an article published by IEEE in 2016 IEEE 16th International Conference on Data Mining (ICDM) on 02/02/2017, available online: https://ieeexplore.ieee.org/document/7837960 The accepted version of the publication may differ from the final published version. Stylometry is the statistical analysis of variations in an author's literary style. The technique has been used in many linguistic analysis applications, such as author profiling, authorship identification, and authorship verification. Over the past two decades, authorship identification has been extensively studied by researchers in the area of natural language processing. However, these studies are generally limited to (i) a small number of candidate authors, and (ii) documents with similar lengths. In this paper, we propose a novel solution by modeling authorship attribution as a set similarity problem to overcome the two stated limitations. We conducted extensive experimental studies on a real dataset collected from an online book archive, Project Gutenberg. Experimental results show that, in comparison to existing stylometry studies, our proposed solution can handle a larger number of documents of different lengths written by a larger pool of candidate authors with a high accuracy.
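
To make the idea of treating authorship attribution as a set similarity problem concrete, here is a minimal sketch. It is not the paper's method: the abstract does not say which features or which similarity measure are used, so character trigrams and Jaccard similarity are assumptions, and `char_ngrams` and `attribute` are hypothetical helper names.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def char_ngrams(text: str, n: int = 3) -> set:
    """Assumed stylometric feature: the set of lowercase character n-grams in a text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def attribute(anonymous: str, known: dict) -> str:
    """Return the candidate author whose known text is most similar to the anonymous one.

    `known` maps an author name to one sample of that author's writing.
    """
    query = char_ngrams(anonymous)
    return max(known, key=lambda author: jaccard(query, char_ngrams(known[author])))
```

Because sets ignore length and repetition, a short anonymous sample can still be compared against a much longer known text, which is one plausible reason a set-based formulation suits documents of different lengths.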

    Towards a better understanding of Tarajem: creating topological networks for Arabic biographical dictionaries

    Biographical writing is one of the earliest and most extensive forms of Arabic literature. Some scholars tend to assume that classical Arabic biographies, widely known as Tarāǧim, arose in conjunction with the study of the reliability of the Hadith transmitters (the reciters of the Prophet Mohammad's sayings), which led to a proliferation of biographical material collected and used to assess the transmitters' trustworthiness. However, a scrutiny of well-known classical Arabic biographical dictionaries such as Siyaru 'A`lāmi an-Nubalā' `The Lives of the Noble Figures' by Adh-Dhahabī shows that they extend their entries to other classes of persons important to the development of particular fields, such as Islamic jurisprudents, rulers, poets, philosophers or physicians. The main contribution of Arabic biographical dictionaries is the cumulative value of the thousands of life histories, which construct a picture of Islamic society in different eras. An Arabic biographical dictionary, therefore, is predominantly used by scholars to look up an eminent person's achievements and historical background. In this project, however, we explore Arabic biographies as a prosopography, rather than a biography in the strict sense. We introduce a novel method for a better understanding of Arabic biographical dictionaries by creating a network of relations among different persons. We utilise Natural Language Processing (NLP) tools to create a topological network from the unstructured data of 45,500 biographical entries collected from different dictionaries. We aim to illustrate how network analysis leveraged by NLP tools can provide scholars with innovative methods for discovering complex constellations of relations between prominent and non-prominent figures spanning several eras and different fields of knowledge. We also use graph visualisation as a means to effectively communicate and explore such complex constellations.
Each network visualisation is purposefully designed to be as simple and robust as possible, to offer scholars a way to move relatively fluidly across the large scale of biographical entries and to easily interpret the minute ties between persons from different walks of life. We make both our data and code publicly available for researchers to replicate the experiment. They can be found at: https://github.com/sadanyh/Relational-Network-for-Arabic-Taraje

    Forecasting tax revenues using time series techniques – a case of Pakistan

    The objective of this research was to forecast the tax revenue of Pakistan for the fiscal year 2016–17 using three different time series techniques and also to analyse the impact of indirect taxes on the working class. The study further analysed the efficiency of three time series models, namely the Autoregressive model (A.R. with seasonal dummies), the Autoregressive Integrated Moving Average model (A.R.I.M.A.), and the Vector Autoregression (V.A.R.) model. In any economy, tax analysis and the forecasting of revenues are of paramount importance for shaping economic and fiscal policies. This study is important for identifying significant variables affecting tax revenue, specifically in Pakistan. The data used for this paper ran from July 1985 to December 2016 (monthly), and the study focused on forecasting for 2017. For the forecasting of total tax revenue, we used components of tax revenue such as direct tax, sales tax, federal excise duty and customs duties. The results of this study revealed that, among these models, the A.R.I.M.A. model gives better forecast values for the total tax revenues of Pakistan. The results further demonstrated that most tax revenue is generated by indirect taxes, which cause more inflation that directly hits the working class of Pakistan
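
As a rough illustration of what an autoregressive forecast involves, the sketch below fits the simplest case, an AR(1) model x_t = c + phi * x_{t-1}, by ordinary least squares and rolls it forward. It is a deliberately reduced stand-in: it omits the seasonal dummies, differencing and moving-average terms of the models named in the abstract, and `fit_ar1` and `forecast` are hypothetical names.

```python
def fit_ar1(series):
    """Ordinary least squares fit of x_t = c + phi * x_{t-1}; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var_prev
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, steps, c, phi):
    """Iterate the fitted recurrence `steps` periods beyond the last observation."""
    out = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out
```

In practice a monthly revenue series would need the seasonal and trend handling that this sketch leaves out, which is the gap the richer models in the study are meant to cover.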

    Native language identification of fluent and advanced non-native writers

    This is an accepted manuscript of an article published by ACM in ACM Transactions on Asian and Low-Resource Language Information Processing in April 2020, available online: https://doi.org/10.1145/3383202 The accepted version of the publication may differ from the final published version. Native Language Identification (NLI) aims at identifying the native languages of authors by analyzing their text samples written in a non-native language. Most existing studies investigate this task for educational applications such as second language acquisition and require learner corpora. This article performs NLI in the challenging context of user-generated content (UGC), where authors are fluent and advanced non-native speakers of a second language. Existing NLI studies with UGC (i) rely on content-specific or social-network features and may not be generalizable to other domains and datasets, (ii) are unable to capture the variations of language-usage patterns within a text sample, and (iii) are not associated with any outlier handling mechanism. Moreover, since a sizable number of people have acquired non-English second languages due to economic and immigration policies, there is a need to gauge the applicability of NLI with UGC to other languages. Unlike existing solutions, we define a topic-independent feature space, which makes our solution generalizable to other domains and datasets. Based on our feature space, we present a solution that mitigates the effect of outliers in the data and helps capture the variations of language-usage patterns within a text sample. Specifically, we represent each text sample as a point set and identify the top-k stylistically similar text samples (SSTs) from the corpus. We then apply the probabilistic k-nearest-neighbors classifier to the identified top-k SSTs to predict the native languages of the authors.
To conduct experiments, we create three new corpora, each written in a different language, namely English, French, and German. Our experimental studies show that our solution outperforms competitive methods and reports more than 80% accuracy across languages. Research funded by Higher Education Commission, and Grants for Development of New Faculty Staff at Chulalongkorn University | Digital Economy Promotion Agency (# MP-62-0003) | Thailand Research Funds (MRG6180266 and MRG6280175).
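
The retrieve-then-classify step described above can be sketched as follows. This is a simplification under stated assumptions rather than the authors' implementation: each text sample is reduced to a single feature vector instead of a point set, similarity is plain Euclidean distance, and class probabilities are taken as vote shares among the top-k neighbours. The names `top_k` and `knn_probabilities` are hypothetical.

```python
import math
from collections import Counter


def top_k(query, corpus, k):
    """Return the k (distance, label) pairs closest to the query vector.

    `corpus` is a list of (feature_vector, native_language_label) pairs.
    """
    scored = [(math.dist(query, vec), label) for vec, label in corpus]
    return sorted(scored)[:k]


def knn_probabilities(query, corpus, k=3):
    """Estimate label probabilities as vote shares among the k nearest samples."""
    neighbours = top_k(query, corpus, k)
    votes = Counter(label for _, label in neighbours)
    return {label: count / len(neighbours) for label, count in votes.items()}
```

Returning a probability per label, instead of a single winning label, makes it possible to flag low-confidence predictions, which is one way a probabilistic classifier can soften the effect of outliers.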

    Tumor necrosis factor-α, interleukin-10, intercellular and vascular adhesion molecules are possible biomarkers of disease severity in complicated Plasmodium vivax isolates from Pakistan.

    Background: The cytokine-mediated endothelial activation pathway is a known mechanism of pathogenesis employed by Plasmodium falciparum to induce severe disease symptoms in the human host. Though P. vivax is considered benign, complicated cases of Plasmodium vivax are being reported worldwide and from Pakistan. It has been hypothesized that P. vivax utilizes a mechanism of pathogenesis similar to that of P. falciparum for manifestations of severe malaria. Therefore, the main objective of this study was to characterize the role of cytokines and endothelial activation markers in complicated Plasmodium vivax isolates from Pakistan. Methods and Principal Findings: A case-control study was conducted using plasma samples from well-characterized groups suffering from P. vivax infection, including uncomplicated cases (n=100) and complicated cases (n=82), together with healthy controls (n=100). Baseline levels of Tumor necrosis factor-α (TNF-α), Interleukin-6 (IL-6), Interleukin-10 (IL-10), Intercellular adhesion molecule-1 (ICAM-1), Vascular adhesion molecule-1 (VCAM-1) and E-selectin were measured by ELISA. Correlation of cytokines and endothelial activation markers was assessed using Spearman's correlation analysis. Furthermore, the significance of these biomarkers as indicators of disease severity was also analyzed. The results showed that TNF-α, IL-10, ICAM-1 and VCAM-1 were increased 3-fold, 3.7-fold and 2-fold between uncomplicated and complicated cases. Comparison of healthy controls with uncomplicated cases showed no significant difference in TNF-α concentrations, while IL-6, IL-10, ICAM-1, VCAM-1 and E-selectin were found to be elevated. In addition, significant positive correlation was observed between TNF-α and IL-10/ICAM-1, IL-6 and IL-10, and ICAM-1 and VCAM-1. A receiver operating characteristic (ROC) curve was generated, which showed that TNF-α, IL-10, ICAM-1 and VCAM-1 were the best individual predictors of complicated P. vivax malaria.
Conclusion: The results suggest that, though endothelial adhesion molecules are inducible by the pro-inflammatory cytokine TNF-α, the cytokine-mediated endothelial activation pathway is not clearly demonstrated as a mechanism of pathogenesis in complicated P. vivax malaria cases from Pakistan
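
For readers unfamiliar with the Spearman analysis mentioned in the methods, the sketch below shows how the coefficient is computed: each variable is converted to ranks (ties share the mean rank) and the Pearson correlation of the ranks is taken. This is a generic textbook formulation, not the study's analysis code, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed concentration scales typical of plasma marker measurements.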

    Domain adaptation of Thai word segmentation models using stacked ensemble

    © 2020. Published by ACL. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://www.aclweb.org/anthology/2020.emnlp-main.315/ Like many Natural Language Processing tasks, Thai word segmentation is domain-dependent. Researchers have been relying on transfer learning to adapt an existing model to a new domain. However, this approach is inapplicable to cases where we can interact with only the input and output layers of the models, also known as “black boxes”. We propose a filter-and-refine solution based on the stacked-ensemble learning paradigm to address this black-box limitation. We conducted extensive experimental studies comparing our method against state-of-the-art models and transfer learning. Experimental results show that our proposed solution is an effective domain adaptation method and performs similarly to the transfer learning method

    Handling cross and out-of-domain samples in Thai word segmentation

    © 2021 The Authors. Published by ACL. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://aclanthology.org/2021.findings-acl.86 While word segmentation is a solved problem in many languages, it is still a challenge in continuous-script or low-resource languages. Like other NLP tasks, word segmentation is domain-dependent, which can be a challenge in low-resource languages like Thai and Urdu, since there can be domains with insufficient data. This investigation proposes a new solution to adapt an existing domain-generic model to a target domain, as well as a data augmentation technique to combat low-resource problems. In addition to domain adaptation, we also propose a framework to handle out-of-domain inputs using an ensemble of domain-specific models called MultiDomain Ensemble (MDE). To assess the effectiveness of the proposed solutions, we conducted extensive experiments on domain adaptation and out-of-domain scenarios. Moreover, we also propose a multiple-task dataset for Thai text processing, including word segmentation. For domain adaptation, we compared our solution to the state-of-the-art Thai word segmentation (TWS) method and obtained improvements from 93.47% to 98.48% at the character level and from 84.03% to 96.75% at the word level. For out-of-domain scenarios, our MDE method significantly outperformed the state-of-the-art TWS and multi-criteria methods. Furthermore, to demonstrate our method’s generalizability, we also applied our MDE framework to other languages, namely Chinese, Japanese, and Urdu, and obtained improvements similar to Thai’s